Depth Control of Model-Free AUVs via Reinforcement Learning
Authors
Abstract
In this paper, we consider depth control problems of an autonomous underwater vehicle (AUV) for tracking desired depth trajectories. Because the dynamical model of the AUV is unknown, these problems cannot be solved by most model-based controllers. We therefore formulate the depth control problems of the AUV as continuous-state, continuous-action Markov decision processes (MDPs) under un...
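A minimal sketch of such a formulation, with a made-up one-dimensional depth model and reward (none of the names or dynamics below come from the article), would expose the task through a Gym-style reset/step interface so that a model-free agent only ever sees states, actions, and rewards:

import numpy as np

# Hypothetical depth-tracking MDP (names and dynamics invented for illustration;
# the article treats the true AUV dynamics as unknown to the controller).
# State: (depth, depth rate, reference depth); action: continuous thrust command.
class DepthTrackingMDP:
    def __init__(self, dt=0.1, max_thrust=5.0):
        self.dt, self.max_thrust = dt, max_thrust
        self.reset()

    def reset(self, reference_depth=10.0):
        self.depth, self.depth_rate = 0.0, 0.0
        self.reference = reference_depth
        return self._state()

    def _state(self):
        return np.array([self.depth, self.depth_rate, self.reference])

    def step(self, action):
        u = float(np.clip(action, -self.max_thrust, self.max_thrust))
        accel = u - 0.5 * self.depth_rate          # toy drag term
        self.depth_rate += accel * self.dt
        self.depth += self.depth_rate * self.dt
        # reward: negative squared tracking error plus a small control penalty
        reward = -(self.depth - self.reference) ** 2 - 0.01 * u ** 2
        return self._state(), reward, False, {}

env = DepthTrackingMDP()
state = env.reset()
state, reward, done, info = env.step(1.0)   # one transition of the MDP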
Similar resources

Reinforcement Learning: Model-free
Simply put, reinforcement learning (RL) is a term used to indicate a large family of different algorithms that all share two key properties. First, the objective of RL is to learn appropriate behavior through trial-and-error experience in a task. Second, in RL, the feedback available to the learning agent is restricted to a reward signal that indicates how well the agent is behaving, but does ...
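As a minimal concrete instance of those two properties, a tabular Q-learning sketch on an invented five-state chain (purely illustrative, not taken from the text) learns behavior from sampled rewards alone, with no model of the environment:

import numpy as np

# Tabular Q-learning on a tiny chain: 5 states, 2 actions (left/right),
# reward 1 only at the rightmost state. Model-free: the agent updates its
# value estimates from (state, action, reward, next state) samples only.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

for episode in range(200):
    s = 0
    for t in range(50):
        a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next, r = step(s, a)
        # trial-and-error update driven only by the scalar reward signal
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next
        if r > 0:
            break

print(np.argmax(Q, axis=1))  # learned greedy action per state (1 = move right)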
Multitask model-free reinforcement learning
Conventional model-free reinforcement learning algorithms are limited to performing only one task, such as navigating to a single goal location in a maze, or reaching one goal state in the Tower of Hanoi block manipulation problem. It has been thought that only model-based algorithms could perform goal-directed actions, optimally adapting to new reward structures in the environment. In this wor...
Model-Free Trajectory Optimization for Reinforcement Learning
Many recent trajectory optimization algorithms alternate between a local approximation of the dynamics and a conservative policy update. However, linearly approximating the dynamics in order to derive the new policy can bias the update and prevent convergence to the optimal policy. In this article, we propose a new model-free algorithm that backpropagates a local quadratic time-dependent Q-F...
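For intuition about the role of a local quadratic time-dependent Q-function (the matrices below are invented, and this is not the algorithm proposed in the cited article), minimizing a quadratic Q over the action yields a linear feedback policy in closed form at each time step:

import numpy as np

# Illustrative quadratic Q-function at one time step, written over z = [x; u]:
#   Q_t(x, u) = 0.5 * z^T H z + g^T z   (treated as a cost to be minimized)
# Setting dQ/du = 0 gives u*(x) = K x + k with K and k computed below.
def greedy_linear_policy(H, g, n_x):
    H_uu = H[n_x:, n_x:]          # action-action block (must be positive definite)
    H_ux = H[n_x:, :n_x]          # action-state block
    g_u = g[n_x:]
    K = -np.linalg.solve(H_uu, H_ux)
    k = -np.linalg.solve(H_uu, g_u)
    return K, k

# Toy sizes: 2-D state (e.g. depth and depth rate), 1-D control.
H = np.array([[2.0, 0.0, 0.3],
              [0.0, 1.0, 0.1],
              [0.3, 0.1, 0.5]])
g = np.array([0.0, 0.0, 0.2])
K, k = greedy_linear_policy(H, g, n_x=2)
print("feedback gain K =", K, "offset k =", k)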
Model-Free Preference-Based Reinforcement Learning
Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tuning from a human expert. In contrast, preference-based reinforcement learning (PBRL) utilizes only pairwise comparisons between trajectories as a feedback signal, which are often more intuitive to specify. Currently available approaches to PBRL for control problems with continuous state/action sp...
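One generic way to turn such pairwise comparisons into a learning signal, shown here only as a sketch (a Bradley-Terry style utility model over made-up trajectory features, not the specific PBRL approach of the cited article), is to fit a scalar trajectory score so that preferred trajectories receive higher scores:

import numpy as np

# Fit a linear trajectory score f(traj) = w . phi(traj) from pairwise
# preferences with a Bradley-Terry likelihood. Features and data are toy.
w = np.zeros(3)

def phi(traj):
    """Hypothetical trajectory features, e.g. mean tracking error, control effort, bias term."""
    return np.asarray(traj, dtype=float)

# Each preference pair: (features of the preferred trajectory, features of the other one).
preferences = [([0.2, 0.1, 1.0], [0.9, 0.4, 1.0]),
               ([0.3, 0.2, 1.0], [0.8, 0.7, 1.0])]

for _ in range(500):                      # gradient ascent on the log-likelihood
    for fa, fb in preferences:
        a, b = phi(fa), phi(fb)
        p_a = 1.0 / (1.0 + np.exp(-(w @ a - w @ b)))   # P(preferred trajectory wins)
        w += 0.1 * (1.0 - p_a) * (a - b)               # gradient of log sigma(w.(a-b))

print("learned score weights:", w)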
Journal
Journal title: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Year: 2019
ISSN: 2168-2216, 2168-2232
DOI: 10.1109/tsmc.2017.2785794